Learning to combine multiple string similarity metrics for effective toponym matching

Abstract

Several tasks related to geographical information retrieval and to the geographical information sciences involve toponym matching, that is, the problem of matching place names that share a common referent. In this article, we present the results of a wide-ranging evaluation of the performance of different string similarity metrics on the toponym matching task. We also report on experiments that use supervised machine learning to combine multiple similarity metrics, which has the natural advantage of avoiding the manual tuning of similarity thresholds. Experiments with a very large dataset show that the performance differences between the individual similarity metrics are relatively small, and that carefully tuning the similarity threshold is important for achieving good results. The methods based on supervised machine learning, particularly when considering ensembles of decision trees, can achieve good results on this task, significantly outperforming the individual similarity metrics.
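To make the setup in the abstract more concrete, the following is a minimal Python sketch of the two approaches it contrasts: matching with one similarity metric and a hand-tuned threshold, and computing several metrics as a feature vector that a supervised classifier could consume. The abstract does not name the metrics that were evaluated, so the three used here (normalized Levenshtein similarity, character-bigram Jaccard, and the `difflib` sequence ratio), the function names, and the 0.8 threshold are all illustrative assumptions rather than the authors' method.

```python
from difflib import SequenceMatcher


def levenshtein(a: str, b: str) -> int:
    """Edit distance computed with the classic two-row dynamic program."""
    if len(a) < len(b):
        a, b = b, a
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def levenshtein_similarity(a: str, b: str) -> float:
    """Edit distance rescaled to [0, 1], where 1.0 means identical strings."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1.0 - levenshtein(a, b) / longest


def bigram_jaccard(a: str, b: str) -> float:
    """Jaccard overlap between the sets of character bigrams of each string."""
    grams_a = {a[i:i + 2] for i in range(len(a) - 1)}
    grams_b = {b[i:i + 2] for i in range(len(b) - 1)}
    union = grams_a | grams_b
    return 1.0 if not union else len(grams_a & grams_b) / len(union)


def sequence_ratio(a: str, b: str) -> float:
    """Similarity ratio from the standard library's SequenceMatcher."""
    return SequenceMatcher(None, a, b).ratio()


def similarity_features(name_a: str, name_b: str) -> list[float]:
    """One score per metric (illustrative choice of metrics, see lead-in)."""
    a, b = name_a.casefold(), name_b.casefold()
    return [levenshtein_similarity(a, b),
            bigram_jaccard(a, b),
            sequence_ratio(a, b)]


def threshold_match(name_a: str, name_b: str, threshold: float = 0.8) -> bool:
    """Single-metric baseline; the threshold is a hypothetical value
    that would have to be tuned by hand on real data."""
    a, b = name_a.casefold(), name_b.casefold()
    return levenshtein_similarity(a, b) >= threshold
```

In the learned setting, `similarity_features` would be applied to each labeled pair of place names, and the resulting vectors, together with match or non-match labels, would be used to train a classifier such as an ensemble of decision trees. The classifier then learns how to weigh the metrics, which replaces the manual choice of `threshold`.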

Bibliographic details

Authors
Santos, Rui; Murrieta-Flores, Patricia; Martins, Bruno
Year 2017
Original format PDF

Similar literature

1. Learning to combine multiple string similarity metrics for effective toponym matching [J]. Santos Rui, Murrieta-Flores Patricia, Martins Bruno. International Journal of Digital Earth. 2018, issue 7a9
2. Combining string and phonetic similarity matching to identify misspelt names of drugs in medical records written in Portuguese [J]. Hegler Tissot, Richard Dobson. Journal of Biomedical Semantics. 2019, issue 1aSupplement
3. A Comparative Study for String Metrics and the Feasibility of Joining them as Combined Text Similarity Measures [J]. Safa S. Abdul-Jabbar, Loay E. George. Aro: The scientific journal of Koya University. 2017, issue 2
4. Combining Multiple Similarity Metrics for Corner Matching [C]. Hatem Khater, Farzin Deravi. Image Processing: Algorithms and Systems V; Proceedings of SPIE-The International Society for Optical Engineering; vol. 6497; Electronic Imaging Science and Technology. 2007
5. User profile relationships using a generalized string similarity metric in social networks [D]. Dabeeru, Vasavi Akhila. 2014
6. Combining string and phonetic similarity matching to identify misspelt names of drugs in medical records written in Portuguese [O]. Hegler Tissot, Richard Dobson. 2019
